
REAL: Reading Out Transformer Activations for Precise Localization in Language Model Steering

Zhan, Li-Ming, Liu, Bo, Xie, Chengqiang, Cao, Jiannong, Wu, Xiao-Ming

arXiv.org Artificial Intelligence

Inference-time steering aims to alter a large language model's (LLM's) responses without changing its parameters, but a central challenge is identifying the internal modules that most strongly govern the target behavior. Existing approaches often rely on simplistic cues or ad hoc heuristics, leading to suboptimal or unintended effects. We introduce REAL, a framework for identifying behavior-relevant modules (attention heads or layers) in Transformer models. For each module, REAL trains a vector-quantized autoencoder (VQ-AE) on its hidden activations and uses a shared, learnable codebook to partition the latent space into behavior-relevant and behavior-irrelevant subspaces. REAL quantifies a module's behavioral relevance by how well its VQ-AE encodings discriminate behavior-aligned from behavior-violating responses via a binary classification metric; this score guides both module selection and steering strength. We evaluate REAL across eight LLMs from the Llama and Qwen families and nine datasets spanning truthfulness enhancement, open-domain QA under knowledge conflicts, and general alignment tasks. REAL enables more effective inference-time interventions, achieving an average relative improvement of 20% (up to 81.5%) over the ITI method on truthfulness steering. In addition, the modules selected by REAL exhibit strong zero-shot generalization in cross-domain truthfulness-steering scenarios.
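As a rough illustration of the scoring idea described above, the sketch below pairs a toy per-module VQ-AE with a linear probe: the autoencoder quantizes a module's activations against a learnable codebook, and the module's relevance score is how well the quantized encodings separate behavior-aligned from behavior-violating responses. The class and function names (VQAutoencoder, score_module), the codebook split, and the linear-probe metric are assumptions for illustration, not the paper's exact implementation; training of the VQ-AE on the module's activations is omitted.

```python
# Minimal sketch of REAL-style module scoring; names and the linear probe
# are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VQAutoencoder(nn.Module):
    """Vector-quantized autoencoder over one module's hidden activations."""
    def __init__(self, dim, codebook_size=64, code_dim=32):
        super().__init__()
        self.encoder = nn.Linear(dim, code_dim)
        self.decoder = nn.Linear(code_dim, dim)
        # Shared learnable codebook; conceptually split into behavior-relevant
        # and behavior-irrelevant partitions.
        self.codebook = nn.Embedding(codebook_size, code_dim)

    def quantize(self, z):
        # Nearest-codeword lookup (straight-through estimator omitted for brevity).
        dists = torch.cdist(z, self.codebook.weight)      # (batch, codebook_size)
        idx = dists.argmin(dim=-1)
        return self.codebook(idx), idx

    def forward(self, h):
        z = self.encoder(h)
        z_q, idx = self.quantize(z)
        return self.decoder(z_q), z_q, idx

def score_module(vqae, acts_aligned, acts_violating):
    """Relevance score: how well quantized encodings separate behavior-aligned
    from behavior-violating responses (accuracy of a simple linear probe)."""
    with torch.no_grad():
        _, z_pos, _ = vqae(acts_aligned)
        _, z_neg, _ = vqae(acts_violating)
    X = torch.cat([z_pos, z_neg])
    y = torch.cat([torch.ones(len(z_pos)), torch.zeros(len(z_neg))])
    probe = nn.Linear(X.shape[1], 1)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss = F.binary_cross_entropy_with_logits(probe(X).squeeze(-1), y)
        loss.backward()
        opt.step()
    preds = (probe(X).squeeze(-1) > 0).float()
    return (preds == y).float().mean().item()  # higher = more behavior-relevant
```

In the paper's framing, this score would then guide both which modules to intervene on and how strongly to steer them.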


Longitudinal and Multimodal Recording System to Capture Real-World Patient-Clinician Conversations for AI and Encounter Research: Protocol

Zahidy, Misk Al, Maldonado, Kerly Guevara, Andrango, Luis Vilatuna, Proano, Ana Cristina, Claros, Ana Gabriela, Jimenez, Maria Lizarazo, Toro-Tobon, David, Montori, Victor M., Ponce-Ponte, Oscar J., Brito, Juan P.

arXiv.org Artificial Intelligence

The promise of AI in medicine depends on learning from data that reflect what matters to patients and clinicians. Most existing models are trained on electronic health records (EHRs), which capture biological measures but rarely patient-clinician interactions. These relationships, central to care, unfold across voice, text, and video, yet remain absent from datasets. As a result, AI systems trained solely on EHRs risk perpetuating a narrow biomedical view of medicine and overlooking the lived exchanges that define clinical encounters. Our objective is to design, implement, and evaluate the feasibility of a longitudinal, multimodal system for capturing patient-clinician encounters, linking 360 degree video/audio recordings with surveys and EHR data to create a dataset for AI research. This single site study is in an academic outpatient endocrinology clinic at Mayo Clinic. Adult patients with in-person visits to participating clinicians are invited to enroll. Encounters are recorded with a 360 degree video camera. After each visit, patients complete a survey on empathy, satisfaction, pace, and treatment burden. Demographic and clinical data are extracted from the EHR. Feasibility is assessed using five endpoints: clinician consent, patient consent, recording success, survey completion, and data linkage across modalities. Recruitment began in January 2025. By August 2025, 35 of 36 eligible clinicians (97%) and 212 of 281 approached patients (75%) had consented. Of consented encounters, 162 (76%) had complete recordings and 204 (96%) completed the survey. This study aims to demonstrate the feasibility of a replicable framework for capturing the multimodal dynamics of patient-clinician encounters. By detailing workflows, endpoints, and ethical safeguards, it provides a template for longitudinal datasets and lays the foundation for AI models that incorporate the complexity of care.
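A minimal sketch of how the five feasibility endpoints could be tallied from per-encounter records is shown below; the Encounter fields and the denominators are assumptions for illustration, not the study's actual data schema.

```python
# Hypothetical tally of the protocol's five feasibility endpoints from
# per-encounter records; field names are illustrative, not the study schema.
from dataclasses import dataclass

@dataclass
class Encounter:
    clinician_consented: bool
    patient_consented: bool
    recording_complete: bool
    survey_completed: bool
    linked_across_modalities: bool

def feasibility(encounters):
    def rate(items, pred):
        return round(100 * sum(pred(x) for x in items) / len(items), 1) if items else 0.0
    consented = [e for e in encounters if e.patient_consented]
    return {
        "clinician_consent_%": rate(encounters, lambda e: e.clinician_consented),
        "patient_consent_%": rate(encounters, lambda e: e.patient_consented),
        "recording_success_%": rate(consented, lambda e: e.recording_complete),
        "survey_completion_%": rate(consented, lambda e: e.survey_completed),
        "data_linkage_%": rate(consented, lambda e: e.linked_across_modalities),
    }
```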


Transferring Expert Cognitive Models to Social Robots via Agentic Concept Bottleneck Models

Zhao, Xinyu, Tan, Zhen, Enisman, Maya, Seo, Minjae, Durantini, Marta R., Albarracin, Dolores, Chen, Tianlong

arXiv.org Artificial Intelligence

Successful group meetings, such as those implemented in group behavioral-change programs, work meetings, and other social contexts, must promote individual goal setting and execution while strengthening the social relationships within the group. Consequently, an ideal facilitator must be sensitive to the subtle dynamics of disengagement, difficulties with individual goal setting and execution, and interpersonal difficulties that signal a need for intervention. The challenges and cognitive load experienced by facilitators create a critical gap for an embodied technology that can interpret social exchanges while remaining aware of the needs of the individuals in the group and providing transparent recommendations that go beyond powerful but "black box" foundation models (FMs) that identify social cues. We address this important demand with a social robot co-facilitator that analyzes multimodal meeting data and provides discreet cues to the facilitator. The robot's reasoning is powered by an agentic concept bottleneck model (CBM), which makes decisions based on human-interpretable concepts like participant engagement and sentiments, ensuring transparency and trustworthiness. Our core contribution is a transfer learning framework that distills the broad social understanding of an FM into our specialized and transparent CBM. This concept-driven system significantly outperforms direct zero-shot FMs in predicting the need for intervention and enables real-time human correction of its reasoning. Critically, we demonstrate robust knowledge transfer: the model generalizes across different groups and successfully transfers the expertise of senior human facilitators to improve the performance of novices. By transferring an expert's cognitive model into an interpretable robotic partner, our work provides a powerful blueprint for augmenting human capabilities in complex social domains.
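A minimal concept-bottleneck sketch under the abstract's description might look like the following: multimodal meeting features map to interpretable concept scores, which in turn drive the intervention decision, and a facilitator can overwrite predicted concepts at inference time. The class name, concept list, and the note about distilling concepts from FM pseudo-labels are illustrative assumptions, not the authors' architecture.

```python
# Concept bottleneck sketch: features -> interpretable concepts -> intervene?
# Names, concepts, and the distillation note are assumptions for illustration.
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, feat_dim, concept_names):
        super().__init__()
        self.concept_names = concept_names
        self.concept_head = nn.Linear(feat_dim, len(concept_names))  # features -> concepts
        self.task_head = nn.Linear(len(concept_names), 1)            # concepts -> decision

    def forward(self, x, concept_override=None):
        concepts = torch.sigmoid(self.concept_head(x))
        # Real-time human correction: a facilitator can overwrite predicted concepts
        # by passing a tensor with NaN in the positions to leave untouched.
        if concept_override is not None:
            mask = ~torch.isnan(concept_override)
            concepts = torch.where(mask, concept_override, concepts)
        return concepts, torch.sigmoid(self.task_head(concepts))

model = ConceptBottleneck(feat_dim=128,
                          concept_names=["engagement", "sentiment", "goal_difficulty"])
x = torch.randn(4, 128)                 # multimodal meeting features (placeholder)
concepts, p_intervene = model(x)
# Distillation (assumed): fit concept_head to FM pseudo-labels on concepts,
# and task_head to expert facilitators' intervention decisions.
```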


Aqara Camera Protect Kit Y100 review: Entry-level home security

PCWorld

The Aqara Camera Protect Kit Y100 is one of the easiest tech products to install and set up that I've tested, and it does an outstanding job of monitoring a relatively small space. Before I get too deep into this review, though, be aware that Aqara does not offer any professional monitoring service, where someone in a central office watches your security system and can dispatch first responders in the event of a break-in, fire, or medical emergency. While such plans are always paid subscriptions, their absence here will be a deal-breaker for some (Aqara does manufacture a Zigbee smart smoke detector if self-monitoring is all you're looking for). The Matter-compatible Aqara Camera Hub G3 includes a Zigbee radio and a dual-band Wi-Fi adapter.


Best video doorbells 2025: Reviews and buying advice

PCWorld

Your front door is your home's first line of defense. Having a video doorbell mounted next to that door is almost as important as having a deadbolt, because it will not only give your visitors an easy way to let you know they're there, it will also alert you when anyone approaches your home–whether or not you're home at the time. In fact, these cameras are so useful you might want to mount one next to every entry point into your home: side entrances, the garage door, and the door to your backyard, for example. Whether you're waiting for friends to visit, watching for troublemakers, tracking parcel deliveries, or hiding from that weird neighbor who keeps asking to borrow your lawn mower, a video doorbell is an essential security tool. TechHive's editors and contributors have been testing video doorbells since 2014, and we continuously evaluate the latest devices along with their accompanying apps.


EufyCam S3 Pro Kit review: Local storage means no subscription

PCWorld

The EufyCam S3 Pro 2-Cam Kit delivers sharp, reliable, and fully independent home security without locking you into ongoing fees. Cloud subscriptions that lock your security camera footage behind a monthly fee are a frustrating reality for homeowners. The EufyCam S3 Pro 2-Cam Kit offers a way out. With 4K video resolution, smart AI detection, and solar panels integrated into the two cameras, it delivers top-shelf performance without roping you into a payment plan. Eufy does offer cloud storage as an option, but the cameras in this kit store their recordings locally on Eufy's HomeBase 3 hub (essentially a network-attached storage box), enhancing your privacy while saving you money on subscription fees.


Automatic Classification of General Movements in Newborns

Chopard, Daphné, Laguna, Sonia, Chin-Cheong, Kieran, Dietz, Annika, Badura, Anna, Wellmann, Sven, Vogt, Julia E.

arXiv.org Artificial Intelligence

General movements (GMs) are spontaneous, coordinated body movements in infants that offer valuable insights into the developing nervous system. Assessed through the Prechtl GM Assessment (GMA), GMs are reliable predictors for neurodevelopmental disorders. However, GMA requires specifically trained clinicians, who are limited in number. To scale up newborn screening, there is a need for an algorithm that can automatically classify GMs from infant video recordings. This data poses challenges, including variability in recording length, device type, and setting, with each video coarsely annotated for overall movement quality. In this work, we introduce a tool for extracting features from these recordings and explore various machine learning techniques for automated GM classification.
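One plausible shape for such a pipeline, assuming per-frame pose keypoints are available from the videos, is sketched below: variable-length recordings are reduced to fixed-length motion statistics and fed to a standard classifier trained on the coarse per-video labels. The feature set, the synthetic data, and the use of scikit-learn are assumptions for illustration, not the paper's method.

```python
# Hedged sketch of an automated GM-classification pipeline: per-frame pose
# keypoints are summarized into fixed-length statistics (so recordings of
# different lengths and devices become comparable), then a standard
# classifier predicts overall movement quality.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def summarize_recording(keypoints):
    """keypoints: (n_frames, n_joints, 2) array of per-frame joint positions."""
    velocity = np.diff(keypoints, axis=0)          # frame-to-frame displacement
    speed = np.linalg.norm(velocity, axis=-1)      # (n_frames - 1, n_joints)
    # Fixed-length summary regardless of recording length.
    return np.concatenate([
        speed.mean(axis=0), speed.std(axis=0),
        keypoints.std(axis=(0, 2)),                # positional variability per joint
    ])

# Hypothetical data: one coarse label ("normal" vs. "abnormal") per video.
rng = np.random.default_rng(0)
videos = [rng.normal(size=(rng.integers(200, 400), 17, 2)) for _ in range(20)]
labels = rng.integers(0, 2, size=20)

X = np.stack([summarize_recording(v) for v in videos])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.predict(X[:3]))
```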


Steam's game recorder is now available to everyone

Engadget

Steam's Game Recording function has come out of beta and is now available to everyone on Mac, PC and Steam Deck, Valve announced. It provides a native tool to record gaming sessions and also offers basic editing tools to trim clips. Users can either run it in the background or manually start or stop recording. On top of that, there's a replay option that lets you quickly review recent recordings. You can then add markers for key moments, and if the game supports Game Recording's Timeline feature, Steam will add its own markers.


Prime Day drops Google Nest devices to record-low prices

Engadget

Amazon Prime Day is usually a great time to pick up things for your home, particularly smart home tech. This year, a bunch of Google's Nest devices have been discounted, with many down to record-low prices. These gadgets are best for anyone who already lives within the Google ecosystem, especially those who already rely on the Google Assistant to help them get things done. You'll find a few Nest security cameras on sale for Prime Day, as well as video doorbells and Wi-Fi systems. If you're looking for even more Prime Day deals, check out Engadget's Prime Day hub where you'll find all of the best tech deals you can get for the shopping event this year.


Intelligent Interface: Enhancing Lecture Engagement with Didactic Activity Summaries

Wróblewska, Anna, Witas, Marcel, Frańczak, Kinga, Kniaź, Arkadiusz, Cheong, Siew Ann, Chee, Tan Seng, Hołyst, Janusz, Paprzycki, Marcin

arXiv.org Artificial Intelligence

Machine learning has recently found many applications, including image analysis methods applied to video streams in the broad sense. In this context, a novel tool was developed for academic educators to enhance the teaching process by automating lecture summarization and offering prompt feedback on how lectures were conducted. The implemented prototype uses machine learning techniques to recognise selected didactic and behavioural features of teachers within lecture video recordings. Specifically, users (teachers) can upload their lecture videos, which are preprocessed and analysed with machine learning models. Users can then view summaries of the recognized didactic features through interactive charts and tables, and the stored predictions support comparisons between lectures based on their didactic content. The application applies text-based models trained on lecture transcriptions, with transcription quality improved by adopting an automatic speech recognition solution. Furthermore, the system is designed to flexibly accommodate future integration of additional machine learning models and software modules for image and video analysis.
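A hedged sketch of the transcription-plus-text-analysis step might look like this, assuming openai-whisper for ASR and simple keyword cues standing in for the trained text-based models; none of these component choices are confirmed by the paper.

```python
# Illustrative pipeline: transcribe a lecture recording with an ASR model,
# then score transcript segments for didactic features. The whisper model,
# the file name, and the keyword heuristics are assumptions for illustration.
import whisper

asr = whisper.load_model("base")
result = asr.transcribe("lecture.mp4")   # hypothetical uploaded lecture recording

# Toy didactic-feature detector over transcript segments; a trained text
# classifier would replace these keyword heuristics in a real system.
FEATURES = {
    "asks_question": ("?",),
    "gives_example": ("for example", "for instance"),
    "summarizes": ("to sum up", "in summary"),
}

summary = {name: 0 for name in FEATURES}
for seg in result["segments"]:
    text = seg["text"].lower()
    for name, cues in FEATURES.items():
        if any(cue in text for cue in cues):
            summary[name] += 1

print(summary)  # per-lecture counts that could feed the interactive charts/tables
```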